Multidimensional scaling of noisy high dimensional data
Authors
Abstract
Multidimensional Scaling (MDS) is a classical technique for embedding data in low dimensions, still in widespread use today. In this paper we study MDS in a modern setting, specifically high dimensions and ambient measurement noise. We show that as the ambient noise level increases, MDS suffers a sharp breakdown that depends on the data dimension and noise level, and we derive an explicit formula for this breakdown point in the case of white noise. We then introduce MDS+, a simple variant of MDS, which applies a shrinkage nonlinearity to the eigenvalues of the MDS similarity matrix. Under a natural loss function measuring the embedding quality, we prove that MDS+ is the unique, asymptotically optimal shrinkage function. MDS+ offers improved embedding, sometimes significantly so, compared with MDS. Importantly, MDS+ calculates the optimal embedding dimension into which the data should be embedded.
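The pipeline the abstract describes (form the MDS similarity matrix, take its eigendecomposition, optionally apply a nonlinearity to the eigenvalues, then embed) can be sketched as follows. This is a minimal illustration, assuming NumPy and squared Euclidean distances as input. The names `classical_mds`, `hard_threshold`, and `pairwise_sq_dists` are hypothetical, and the hard-threshold rule is only a placeholder for a shrinkage function: the optimal MDS+ shrinker is not given in this abstract and is not reproduced here.

```python
import numpy as np


def pairwise_sq_dists(points):
    """Squared Euclidean distances between the rows of `points`."""
    sq_norms = np.sum(points ** 2, axis=1)
    dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * points @ points.T
    return np.maximum(dists, 0.0)  # clip tiny negatives from round-off


def classical_mds(sq_dists, dim, shrink=None):
    """Classical MDS embedding from an n x n squared-distance matrix.

    `shrink` is an optional function applied to the eigenvalues of the
    similarity matrix (a hypothetical hook, not the paper's formula).
    """
    n = sq_dists.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double centering turns squared distances into a similarity (Gram) matrix.
    similarity = -0.5 * centering @ sq_dists @ centering
    evals, evecs = np.linalg.eigh(similarity)
    order = np.argsort(evals)[::-1]  # largest eigenvalues first
    evals, evecs = evals[order], evecs[:, order]
    if shrink is not None:
        evals = shrink(evals)
    top = np.clip(evals[:dim], 0.0, None)  # guard against negative values
    return evecs[:, :dim] * np.sqrt(top)


def hard_threshold(cutoff):
    """Placeholder shrinker: zero out eigenvalues at or below `cutoff`."""
    return lambda evals: np.where(evals > cutoff, evals, 0.0)


# Toy data: a 2-D signal embedded in 200 ambient dimensions plus white noise.
rng = np.random.default_rng(0)
signal = np.zeros((50, 200))
signal[:, :2] = 5.0 * rng.normal(size=(50, 2))
noisy = signal + 0.1 * rng.normal(size=signal.shape)
embedding = classical_mds(pairwise_sq_dists(noisy), dim=2)
```

Passing `shrink=hard_threshold(cutoff)` discards directions whose eigenvalues fall below the chosen cutoff, so the number of surviving eigenvalues acts as a data-driven embedding dimension. How to choose the cutoff, and how to shrink the surviving eigenvalues, is exactly what the paper's analysis addresses.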
Similar resources
High Performance Multidimensional Scaling for Large High-Dimensional Data Visualization
Technical advancements produce a huge amount of scientific data, usually in high-dimensional formats, and it is becoming more important to analyze such large-scale high-dimensional data. Dimension reduction is a well-known approach for high-dimensional data visualization, but it can be very time- and memory-demanding for large problems. Among many dimension reduction methods, multidimensi...
Data Visualization With Multidimensional Scaling
We discuss methodology for multidimensional scaling (MDS) and its implementation in two software systems, GGvis and XGvis. MDS is a visualization technique for proximity data, that is, data in the form of N × N dissimilarity matrices. MDS constructs maps ("configurations," "embeddings") in ℝ^k by interpreting the dissimilarities as distances. Two frequent sources of dissimilarities are high-dim...
Multidimensional Scaling and Data Clustering
Visualizing and structuring pairwise dissimilarity data are difficult combinatorial optimization problems known as multidimensional scaling or pairwise data clustering. Algorithms for embedding dissimilarity data sets in a Euclidean space, for clustering these data, and for actively selecting data to support the clustering process are discussed in the maximum entropy framework. Active data select...
The noisy multidimensional scaling problem: an optimization approach
Multidimensional scaling is a fundamental problem in data analysis and has many applications. Its goal is to find a Euclidean graphic representation of a given set of data in a "low" dimensional space (generally in ℝ^2 or ℝ^3). This problem can be formulated as a nonlinear global optimization problem. To solve it, a Levenberg-Marquardt method is applied to different cost functions. Res...
Journal
Journal title: Applied and Computational Harmonic Analysis
Year: 2021
ISSN: ['1096-603X', '1063-5203']
DOI: https://doi.org/10.1016/j.acha.2020.11.006